Double Pareto Lognormal distribution (dpareto_lognorm)#

The double Pareto lognormal (dPlN) distribution is a continuous distribution on \((0,\infty)\) that combines:

  • a lognormal “body” (multiplicative variability / proportional growth), and

  • Pareto (power-law) tails (extremes that are rare, but far more frequent than a lognormal would predict).

It’s a strong candidate for positive size variables that look roughly lognormal but show unusually large outliers (e.g., income/wealth, firm size, city size, file sizes, web traffic).


Learning goals#

  • Classify the distribution and interpret its parameters \((\mu,\sigma,\alpha,\beta)\).

  • Write down the PDF and CDF (and implement stable logpdf / logcdf).

  • Derive moments and understand when they are finite.

  • Implement a fast NumPy-only sampler using a simple generative representation.

  • Use SciPy’s implementation (scipy.stats.dpareto_lognorm) for inference and fitting.

import numpy as np

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

import scipy
from scipy import special
from scipy.stats import dpareto_lognorm as dpln_dist
from scipy.stats import lognorm, norm

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)

# Record versions for reproducibility (useful when numerical details matter).
VERSIONS = {"numpy": np.__version__, "scipy": scipy.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'scipy': '1.15.0', 'plotly': '6.5.2'}

1) Title & Classification#

  • Name: dpareto_lognorm (Double Pareto Lognormal; SciPy: scipy.stats.dpareto_lognorm)

  • Type: Continuous

  • Support (standard form): \(x \in (0,\infty)\)

  • Parameter space (standard form):

    • \(\mu \in \mathbb{R}\)

    • \(\sigma > 0\)

    • \(\alpha > 0\)

    • \(\beta > 0\)

  • SciPy shape parameters: u = \(\mu\), s = \(\sigma\), a = \(\alpha\), b = \(\beta\)

  • SciPy location/scale: loc \(\in \mathbb{R}\), scale \(> 0\), with \(X = \text{loc} + \text{scale}\,Y\), where \(Y \sim \mathrm{dPlN}(\mu,\sigma,\alpha,\beta)\).

Unless stated otherwise, this notebook uses the standard form (loc=0, scale=1).

2) Intuition & Motivation#

What it models#

Many real-world positive quantities are well-modeled by a lognormal in the middle (because they arise from multiplicative effects), but have heavier tails than a lognormal:

  • too many very large observations (e.g., “winner-take-most” phenomena)

  • sometimes extra mass near 0 (many very small values)

The double Pareto lognormal addresses this by keeping a lognormal-like bulk while giving both tails power-law behavior.

A particularly useful generative representation is:

\[ \log X = Z + \frac{E_1}{\alpha} - \frac{E_2}{\beta}, \qquad Z \sim \mathcal{N}(\mu,\sigma^2),\; E_1,E_2 \sim \mathrm{Exp}(1),\; \text{independent}. \]

Equivalently (exponentiating),

\[ X = U\,\frac{V_1}{V_2}, \qquad \log U \sim \mathcal{N}(\mu,\sigma^2),\; V_1 \sim \mathrm{Pareto}(\alpha),\; V_2 \sim \mathrm{Pareto}(\beta),\; \text{independent}, \]

where \(\mathrm{Pareto}(\alpha)\) denotes the classical Pareto distribution with support \([1,\infty)\) and tail \(\mathbb{P}(V>v)=v^{-\alpha}\).

Typical real-world use cases#

  • Income / wealth distributions (lognormal-ish center with Pareto upper tail)

  • Firm sizes, city sizes, file sizes

  • Web traffic, sales volumes, and other “size” variables with many small values and occasional extremes

Relations to other distributions#

  • Lognormal: the “body” looks lognormal when tails are not too heavy.

  • Pareto: the far right tail behaves like a Pareto with index \(\alpha\).

  • Laplace on log-scale: the term \(E_1/\alpha - E_2/\beta\) is an asymmetric Laplace random variable, so \(\log X\) is Normal + (asymmetric) Laplace.

3) Formal Definition#

Let \(\phi\) and \(\Phi\) denote the standard normal PDF and CDF:

\[ \phi(z) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}z^2\right), \qquad \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt. \]

Define

\[ z = \frac{\log x - \mu}{\sigma}, \qquad y_1 = \alpha\sigma - z, \qquad y_2 = \beta\sigma + z, \]

and the (normal) Mills ratio

\[ R(t) = \frac{1-\Phi(t)}{\phi(t)} = \frac{\Phi(-t)}{\phi(t)}. \]

PDF#

For \(x>0\),

\[ f(x;\mu,\sigma,\alpha,\beta) = \frac{\alpha\beta}{(\alpha+\beta)\,x}\;\phi(z)\;\bigl(R(y_1) + R(y_2)\bigr). \]

CDF#

For \(x>0\),

\[ F(x;\mu,\sigma,\alpha,\beta) = \Phi(z) + \frac{\phi(z)}{\alpha+\beta}\Bigl(\alpha\,R(y_2) - \beta\,R(y_1)\Bigr), \]

and \(F(x)=0\) for \(x\le 0\) in the standard form.

Quantile function (PPF)#

There is no simple closed form for the PPF; SciPy computes it numerically.

LOG_SQRT_2PI = 0.5 * np.log(2.0 * np.pi)


def log_phi(z: np.ndarray) -> np.ndarray:
    """Log of standard normal PDF."""
    z = np.asarray(z, dtype=float)
    return -0.5 * z**2 - LOG_SQRT_2PI


def log_Phi(z: np.ndarray) -> np.ndarray:
    """Log of standard normal CDF (stable)."""
    z = np.asarray(z, dtype=float)
    return special.log_ndtr(z)


def log_Phic(z: np.ndarray) -> np.ndarray:
    """Log of standard normal survival function 1-Φ(z) = Φ(-z) (stable)."""
    z = np.asarray(z, dtype=float)
    return special.log_ndtr(-z)


def log_R(t: np.ndarray) -> np.ndarray:
    """Log Mills ratio R(t) = (1-Φ(t))/φ(t), computed stably."""
    t = np.asarray(t, dtype=float)
    return log_Phic(t) - log_phi(t)


def log1mexp(a: np.ndarray) -> np.ndarray:
    """Compute log(1 - exp(a)) for a <= 0 in a numerically stable way."""
    a = np.asarray(a, dtype=float)
    if np.any(a > 0):
        raise ValueError("log1mexp is only defined for a <= 0")

    out = np.empty_like(a)
    cutoff = -np.log(2.0)
    mask = a < cutoff
    out[mask] = np.log1p(-np.exp(a[mask]))
    out[~mask] = np.log(-np.expm1(a[~mask]))
    return out


def dpln_logpdf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """Log-PDF of the standard dPlN(μ,σ,α,β) distribution (loc=0, scale=1)."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, fill_value=-np.inf, dtype=float)

    mask = x > 0
    xm = x[mask]

    z = (np.log(xm) - mu) / sigma
    y1 = alpha * sigma - z
    y2 = beta * sigma + z

    out[mask] = (
        np.log(alpha)
        + np.log(beta)
        - np.log(alpha + beta)
        - np.log(xm)
        + log_phi(z)
        + np.logaddexp(log_R(y1), log_R(y2))
    )
    return out


def dpln_pdf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """PDF of the standard dPlN(μ,σ,α,β) distribution."""
    return np.exp(dpln_logpdf(x, mu, sigma, alpha, beta))


def dpln_logcdf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """Log-CDF of the standard dPlN(μ,σ,α,β) distribution (stable)."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, fill_value=-np.inf, dtype=float)

    mask = x > 0
    xm = x[mask]

    z = (np.log(xm) - mu) / sigma
    y1 = alpha * sigma - z
    y2 = beta * sigma + z

    t1 = log_Phi(z)
    t2 = log_phi(z)
    t3 = np.log(beta) + log_R(y1)
    t4 = np.log(alpha) + log_R(y2)

    # log(beta*R(y1) - alpha*R(y2)) with sign tracking.
    t5, sign = special.logsumexp(
        np.stack([t3, t4], axis=0),
        b=np.stack([np.ones_like(t3), -np.ones_like(t3)], axis=0),
        axis=0,
        return_sign=True,
    )

    temp = np.stack([t1, t2 + t5 - np.log(alpha + beta)], axis=0)
    out[mask] = special.logsumexp(
        temp,
        b=np.stack([np.ones_like(t1), -sign], axis=0),
        axis=0,
    )
    return out


def dpln_cdf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """CDF of the standard dPlN(μ,σ,α,β) distribution."""
    return np.exp(dpln_logcdf(x, mu, sigma, alpha, beta))


def dpln_logsf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """Log survival function log(1-F(x)), computed stably via log1mexp."""
    return log1mexp(dpln_logcdf(x, mu, sigma, alpha, beta))


def dpln_sf(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> np.ndarray:
    """Survival function 1-F(x)."""
    return np.exp(dpln_logsf(x, mu, sigma, alpha, beta))
# Sanity check: our formulas match SciPy.
mu, sigma, alpha, beta = 0.2, 0.7, 2.5, 1.2
x = np.logspace(-4, 4, 200)

assert np.allclose(dpln_pdf(x, mu, sigma, alpha, beta), dpln_dist.pdf(x, mu, sigma, alpha, beta))
assert np.allclose(dpln_cdf(x, mu, sigma, alpha, beta), dpln_dist.cdf(x, mu, sigma, alpha, beta))
assert np.allclose(dpln_logpdf(x, mu, sigma, alpha, beta), dpln_dist.logpdf(x, mu, sigma, alpha, beta))
assert np.allclose(dpln_logcdf(x, mu, sigma, alpha, beta), dpln_dist.logcdf(x, mu, sigma, alpha, beta))

4) Moments & Properties#

Tail behavior (why “double Pareto”)#

The distribution has power-law tails:

  • As \(x\to\infty\), the right tail behaves like \(\mathbb{P}(X>x) \sim C\,x^{-\alpha}\), so the PDF decays like \(x^{-\alpha-1}\).

  • As \(x\to 0^+\), the left tail behaves like \(\mathbb{P}(X\le x) \sim C\,x^{\beta}\), so the PDF behaves like \(x^{\beta-1}\) near 0.

This is why \(\alpha\) controls upper extremes, while \(\beta\) controls how much mass sits near very small values.

Mean, variance, skewness, kurtosis#

All positive moments have a clean form. For \(k<\alpha\),

\[ \mathbb{E}[X^k] = \frac{\alpha\beta}{(\alpha-k)(\beta+k)}\;\exp\left(k\mu + \tfrac{1}{2}k^2\sigma^2\right). \]

Consequences:

  • Mean exists iff \(\alpha>1\).

  • Variance exists iff \(\alpha>2\).

  • Skewness exists iff \(\alpha>3\).

  • Kurtosis exists iff \(\alpha>4\).

(Similarly, negative moments \(\mathbb{E}[X^{-k}]\) exist exactly for \(0 < k < \beta\).)

MGF / characteristic function#

  • The MGF \(M_X(t)=\mathbb{E}[e^{tX}]\) does not exist for any \(t>0\) because the right tail is polynomial.

  • The Laplace transform exists for \(t<0\).

  • The characteristic function \(\varphi_X(t)=\mathbb{E}[e^{itX}]\) exists for all real \(t\) (bounded integrand), but there is no simple elementary closed form.

Entropy#

The differential entropy \(h(X)=-\mathbb{E}[\log f(X)]\) is finite for all valid parameter values and is available via scipy.stats.dpareto_lognorm.entropy (computed numerically).

def dpln_raw_moment(k: float, mu: float, sigma: float, alpha: float, beta: float) -> float:
    """Raw moment E[X^k] for k < alpha (standard form)."""
    if alpha <= k:
        return np.nan
    return (alpha * beta) / ((alpha - k) * (beta + k)) * np.exp(k * mu + 0.5 * (k * sigma) ** 2)


def dpln_mvsk_from_raw(mu: float, sigma: float, alpha: float, beta: float):
    """Return (mean, var, skew, kurt_excess) when moments exist; else NaNs."""
    m1 = dpln_raw_moment(1, mu, sigma, alpha, beta)
    m2 = dpln_raw_moment(2, mu, sigma, alpha, beta)
    m3 = dpln_raw_moment(3, mu, sigma, alpha, beta)
    m4 = dpln_raw_moment(4, mu, sigma, alpha, beta)

    mean = m1
    var = m2 - m1**2

    # Central moments via raw moments.
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4

    skew = mu3 / (var ** 1.5)
    kurt_excess = mu4 / (var**2) - 3.0

    return mean, var, skew, kurt_excess


mu, sigma, alpha, beta = 0.2, 0.6, 6.0, 2.0

mean_f, var_f, skew_f, kurt_f = dpln_mvsk_from_raw(mu, sigma, alpha, beta)
mean_s, var_s, skew_s, kurt_s = dpln_dist.stats(mu, sigma, alpha, beta, moments="mvsk")
entropy_s = dpln_dist.entropy(mu, sigma, alpha, beta)

print("Moments from closed-form raw moments:")
print("  mean   =", mean_f)
print("  var    =", var_f)
print("  skew   =", skew_f)
print("  kurt   =", kurt_f)

print()
print("Moments from SciPy:")
print("  mean   =", mean_s)
print("  var    =", var_s)
print("  skew   =", skew_s)
print("  kurt   =", kurt_s)
print("  entropy=", entropy_s)

# Monte Carlo check (should be close when moments are finite).
N = 80_000
x_mc = dpln_dist.rvs(mu, sigma, alpha, beta, size=N, random_state=rng)

print()
print("Monte Carlo (N=80k):")
print("  mean ≈", x_mc.mean())
print("  var  ≈", x_mc.var())
print("  entropy ≈", -dpln_dist.logpdf(x_mc, mu, sigma, alpha, beta).mean())
Moments from closed-form raw moments:
  mean   = 1.1698276715473797
  var    = 0.9301438713517878
  skew   = 2.787565892454241
  kurt   = 18.307973079452186

Moments from SciPy:
  mean   = 1.1698276715473797
  var    = 0.9301438713517878
  skew   = 2.7875658924542424
  kurt   = 18.307973079452186
  entropy= 1.0436993462234498

Monte Carlo (N=80k):
  mean ≈ 1.1713176076871294
  var  ≈ 0.9315080413645616
  entropy ≈ 1.044800564368963

5) Parameter Interpretation#

  • \(\mu\) (log-location) shifts the distribution multiplicatively: increasing \(\mu\) scales typical values by \(e^{\Delta\mu}\).

  • \(\sigma\) (log-scale) increases spread in the lognormal “body” and also amplifies variability in both tails.

  • \(\alpha\) (right-tail index) controls how fast the upper tail decays:

    • smaller \(\alpha\) → heavier right tail (more extremes)

    • moments exist only up to order \(<\alpha\)

  • \(\beta\) (left-tail index) controls behavior near 0:

    • smaller \(\beta\) → more mass near 0 (and stronger divergence of the density as \(x\to 0^+\) if \(\beta<1\))

Below we visualize how the PDF changes when varying one parameter at a time.

# Parameter sweeps: how the PDF changes.

x = np.logspace(-3, 3, 600)

base = dict(mu=0.0, sigma=0.6, alpha=3.0, beta=2.0)

# 1) Vary alpha (right tail)
fig_alpha = go.Figure()
for a in [1.5, 3.0, 8.0]:
    y = dpln_dist.pdf(x, base["mu"], base["sigma"], a, base["beta"])
    fig_alpha.add_trace(go.Scatter(x=x, y=y, mode="lines", name=f"alpha={a}"))
fig_alpha.update_xaxes(type="log", title="x")
fig_alpha.update_yaxes(type="log", title="pdf(x)")
fig_alpha.update_layout(title="Effect of alpha: right-tail heaviness")
fig_alpha.show()

# 2) Vary beta (left tail)
fig_beta = go.Figure()
for b in [0.5, 1.5, 4.0]:
    y = dpln_dist.pdf(x, base["mu"], base["sigma"], base["alpha"], b)
    fig_beta.add_trace(go.Scatter(x=x, y=y, mode="lines", name=f"beta={b}"))
fig_beta.update_xaxes(type="log", title="x")
fig_beta.update_yaxes(type="log", title="pdf(x)")
fig_beta.update_layout(title="Effect of beta: mass near 0")
fig_beta.show()

# 3) Vary sigma (lognormal spread)
fig_sigma = go.Figure()
for s in [0.2, 0.6, 1.0]:
    y = dpln_dist.pdf(x, base["mu"], s, base["alpha"], base["beta"])
    fig_sigma.add_trace(go.Scatter(x=x, y=y, mode="lines", name=f"sigma={s}"))
fig_sigma.update_xaxes(type="log", title="x")
fig_sigma.update_yaxes(type="log", title="pdf(x)")
fig_sigma.update_layout(title="Effect of sigma: lognormal-body spread")
fig_sigma.show()

6) Derivations#

6.1 Expectation (general moment)#

Using the generative form

\[ \log X = Z + \frac{E_1}{\alpha} - \frac{E_2}{\beta}, \qquad Z\perp E_1\perp E_2, \]

we get for any real \(k\) (when the expectations exist)

\[ X^k = \exp(kZ)\,\exp\left(\frac{kE_1}{\alpha}\right)\,\exp\left(-\frac{kE_2}{\beta}\right). \]

Independence implies

\[ \mathbb{E}[X^k] = \mathbb{E}[e^{kZ}]\,\mathbb{E}\left[e^{kE_1/\alpha}\right]\,\mathbb{E}\left[e^{-kE_2/\beta}\right]. \]

Now:

  • If \(Z\sim\mathcal{N}(\mu,\sigma^2)\) then \(\mathbb{E}[e^{kZ}] = \exp\left(k\mu + \tfrac{1}{2}k^2\sigma^2\right)\).

  • If \(E\sim\mathrm{Exp}(1)\) then \(\mathbb{E}[e^{tE}] = \frac{1}{1-t}\) for \(t<1\).

Therefore, for \(k<\alpha\),

\[ \mathbb{E}\left[e^{kE_1/\alpha}\right] = \frac{1}{1-k/\alpha} = \frac{\alpha}{\alpha-k}, \qquad \mathbb{E}\left[e^{-kE_2/\beta}\right] = \frac{1}{1+k/\beta} = \frac{\beta}{\beta+k}, \]

and multiplying gives

\[ \mathbb{E}[X^k] = \frac{\alpha\beta}{(\alpha-k)(\beta+k)}\;\exp\left(k\mu + \tfrac{1}{2}k^2\sigma^2\right). \]

Setting \(k=1\) yields the mean (when \(\alpha>1\)).

6.2 Variance#

When \(\alpha>2\),

\[ \mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2, \]

using the moment formula with \(k=1\) and \(k=2\).

6.3 Likelihood (i.i.d. sample)#

For i.i.d. data \(x_1,\ldots,x_n\) (all \(>0\)), the log-likelihood is

\[ \ell(\mu,\sigma,\alpha,\beta\mid x_{1:n}) = \sum_{i=1}^n \log f(x_i;\mu,\sigma,\alpha,\beta). \]

A convenient stable form (using the definitions from Section 3) is

\[ \log f(x_i) = \log\alpha + \log\beta - \log(\alpha+\beta) - \log x_i + \log\phi(z_i) + \log\bigl( R(y_{1,i}) + R(y_{2,i}) \bigr), \]

where the last term is best computed with a logaddexp (log-sum-exp) trick.

def dpln_loglik(x: np.ndarray, mu: float, sigma: float, alpha: float, beta: float) -> float:
    x = np.asarray(x, dtype=float)
    return float(dpln_logpdf(x, mu, sigma, alpha, beta).sum())


# Quick check: likelihood prefers the true parameters on synthetic data.
mu0, sigma0, alpha0, beta0 = 0.1, 0.7, 3.5, 2.0
x = dpln_dist.rvs(mu0, sigma0, alpha0, beta0, size=2_000, random_state=rng)

ll_true = dpln_loglik(x, mu0, sigma0, alpha0, beta0)
ll_perturbed = dpln_loglik(x, mu0 + 0.3, sigma0 * 1.1, alpha0 * 0.8, beta0 * 1.3)

print("log-likelihood at true params     =", ll_true)
print("log-likelihood at perturbed params=", ll_perturbed)
log-likelihood at true params     = -2294.9214816192944
log-likelihood at perturbed params= -2600.229763380172

7) Sampling & Simulation#

NumPy-only algorithm#

Using the representation

\[ \log X = Z + \frac{E_1}{\alpha} - \frac{E_2}{\beta} \]

with \(Z\sim\mathcal{N}(\mu,\sigma^2)\) and \(E_1,E_2\sim\mathrm{Exp}(1)\) independent, sampling is straightforward:

  1. Draw \(Z\) from a normal.

  2. Draw \(E_1,E_2\) from independent exponentials.

  3. Return \(X=\exp\left(Z + E_1/\alpha - E_2/\beta\right)\).

This is exactly the approach used internally by SciPy.

def dpln_rvs_numpy(
    mu: float,
    sigma: float,
    alpha: float,
    beta: float,
    size: int | tuple[int, ...],
    rng: np.random.Generator | None = None,
) -> np.ndarray:
    """NumPy-only sampler for the standard dPlN(μ,σ,α,β) distribution."""
    if rng is None:
        rng = np.random.default_rng()

    Z = rng.normal(mu, sigma, size=size)
    E1 = rng.standard_exponential(size=size)
    E2 = rng.standard_exponential(size=size)
    return np.exp(Z + E1 / alpha - E2 / beta)


mu, sigma, alpha, beta = 0.0, 0.6, 3.0, 2.0

x_np = dpln_rvs_numpy(mu, sigma, alpha, beta, size=200_000, rng=rng)
x_sp = dpln_dist.rvs(mu, sigma, alpha, beta, size=200_000, random_state=rng)

print("NumPy sampler vs SciPy sampler (same RNG stream not expected to match):")
print("  mean NumPy:", x_np.mean())
print("  mean SciPy:", x_sp.mean())
print("  median NumPy:", np.median(x_np))
print("  median SciPy:", np.median(x_sp))
NumPy sampler vs SciPy sampler (same RNG stream not expected to match):
  mean NumPy: 1.1982726939077264
  mean SciPy: 1.1967383826043556
  median NumPy: 0.8688818370335867
  median SciPy: 0.8725882969800218

8) Visualization#

We’ll look at:

  • the PDF (and Monte Carlo histogram)

  • the CDF (and an empirical CDF)

  • the tail on a log–log plot to reveal the Pareto slope

mu, sigma, alpha, beta = 0.0, 0.6, 3.0, 2.0

samples = dpln_rvs_numpy(mu, sigma, alpha, beta, size=60_000, rng=rng)

# Focus on the body up to a high quantile; tails are shown separately.
x_max = np.quantile(samples, 0.995)
x_grid = np.logspace(-3, np.log10(x_max), 600)

pdf_grid = dpln_dist.pdf(x_grid, mu, sigma, alpha, beta)

fig = go.Figure()
fig.add_histogram(
    x=samples,
    histnorm="probability density",
    nbinsx=140,
    name="Monte Carlo",
    opacity=0.55,
)
fig.add_trace(
    go.Scatter(x=x_grid, y=pdf_grid, mode="lines", name="PDF (theory)", line=dict(width=3))
)

fig.update_xaxes(type="log", title="x", range=[np.log10(1e-3), np.log10(x_max)])
fig.update_yaxes(title="density")
fig.update_layout(title="dpareto_lognorm: PDF + Monte Carlo histogram")
fig.show()
# CDF + empirical CDF

x_grid = np.logspace(-3, 3, 700)
cdf_grid = dpln_dist.cdf(x_grid, mu, sigma, alpha, beta)

xs = np.sort(samples)
ecdf = np.arange(1, xs.size + 1) / xs.size

fig = go.Figure()
fig.add_trace(go.Scatter(x=x_grid, y=cdf_grid, mode="lines", name="CDF (theory)", line=dict(width=3)))
fig.add_trace(
    go.Scatter(x=xs, y=ecdf, mode="lines", name="empirical CDF", line=dict(width=2, dash="dot"))
)

fig.update_xaxes(type="log", title="x")
fig.update_yaxes(title="F(x)")
fig.update_layout(title="dpareto_lognorm: CDF vs empirical CDF")
fig.show()
# Tail plot: survival function on log-log scale

x_tail = np.logspace(0, 4, 200)
sf_tail = dpln_dist.sf(x_tail, mu, sigma, alpha, beta)

fig = go.Figure()
fig.add_trace(go.Scatter(x=x_tail, y=sf_tail, mode="lines", name="SF(x)=P(X>x)"))

# Reference slope ~ x^{-alpha}
C = sf_tail[0] * (x_tail[0] ** alpha)
ref = C * x_tail ** (-alpha)
fig.add_trace(
    go.Scatter(x=x_tail, y=ref, mode="lines", name=f"reference ~ x^(-{alpha})", line=dict(dash="dash"))
)

fig.update_xaxes(type="log", title="x")
fig.update_yaxes(type="log", title="P(X>x)")
fig.update_layout(title="Right tail looks Pareto on log-log axes")
fig.show()

9) SciPy Integration#

SciPy provides scipy.stats.dpareto_lognorm with the usual distribution API:

  • pdf, logpdf

  • cdf, logcdf

  • sf, logsf

  • rvs

  • fit (MLE)

You can also freeze a distribution with specific parameters to get a reusable object.

# Frozen distribution
mu, sigma, alpha, beta = 0.1, 0.7, 3.5, 2.0

dist = dpln_dist(mu, sigma, alpha, beta)

x = np.logspace(-2, 3, 6)
print("pdf:", dist.pdf(x))
print("cdf:", dist.cdf(x))
print("rvs:", dist.rvs(size=5, random_state=rng))

# Fitting example (MLE) on synthetic data.
true_params = (0.1, 0.7, 3.5, 2.0)
data = dpln_dist.rvs(*true_params, size=5_000, random_state=rng)

# Fix loc=0 and scale=1 to estimate only (u,s,a,b) in the standard form.
(u_hat, s_hat, a_hat, b_hat, loc_hat, scale_hat) = dpln_dist.fit(data, floc=0, fscale=1)

print()
print("True (u,s,a,b):", true_params)
print("Fit  (u,s,a,b):", (u_hat, s_hat, a_hat, b_hat))
print("(loc, scale) fixed:", (loc_hat, scale_hat))
pdf: [0.027764 0.274118 0.462581 0.000945 0.       0.      ]
cdf: [0.000139 0.013823 0.538674 0.997068 0.999999 1.      ]
rvs: [3.412395 7.857961 2.926468 0.45226  0.671159]
True (u,s,a,b): (0.1, 0.7, 3.5, 2.0)
Fit  (u,s,a,b): (0.10183273702285217, 0.7123237885111698, 3.6887988783026655, 2.1678064142366296)
(loc, scale) fixed: (0, 1)
/home/tempa/miniconda3/lib/python3.12/site-packages/scipy/stats/_continuous_distns.py:1963: RuntimeWarning:

divide by zero encountered in divide

10) Statistical Use Cases#

10.1 Hypothesis testing (goodness-of-fit)#

If parameters are specified in advance (not estimated on the same data), you can test whether data plausibly comes from a dPlN distribution using goodness-of-fit tests like Kolmogorov–Smirnov (KS).

Caveat: if you estimate parameters from the data and then run KS on the same data, usual p-values are not exact (you’d want a corrected procedure or a bootstrap).

10.2 Bayesian modeling#

Because we can evaluate the likelihood \(p(x\mid\mu,\sigma,\alpha,\beta)\), we can place priors on parameters and do Bayesian inference:

\[ p(\theta\mid x_{1:n}) \propto p(x_{1:n}\mid \theta)\,p(\theta), \qquad \theta=(\mu,\sigma,\alpha,\beta). \]

Below we show a simple 1D example: infer \(\mu\) with \((\sigma,\alpha,\beta)\) held fixed.

10.3 Generative modeling#

In simulation or synthetic-data generation, dPlN is a convenient way to generate lognormal-like samples with realistic heavy tails.

# Hypothesis testing example: KS test when parameters are known.
from scipy.stats import kstest

mu, sigma, alpha, beta = 0.0, 0.6, 3.0, 2.0
x = dpln_dist.rvs(mu, sigma, alpha, beta, size=3_000, random_state=rng)

D, p_value = kstest(x, dpln_dist(mu, sigma, alpha, beta).cdf)
print("KS test against dPlN(u=0, s=0.6, a=3, b=2):")
print("  D      =", D)
print("  p-value=", p_value)

# Compare against a (mis-specified) lognormal with the same fitted parameters.
shape_ln, loc_ln, scale_ln = lognorm.fit(x, floc=0)
D_ln, p_ln = kstest(x, lognorm(shape_ln, loc_ln, scale_ln).cdf)
print()
print("KS test against fitted lognormal (p-value not exact due to fitting):")
print("  D      =", D_ln)
print("  p-value=", p_ln)
KS test against dPlN(u=0, s=0.6, a=3, b=2):
  D      = 0.01289957414019094
  p-value= 0.6952763982788772

KS test against fitted lognormal (p-value not exact due to fitting):
  D      = 0.03175003284745531
  p-value= 0.004620185724241905
# Model comparison via AIC: dPlN vs lognormal on the same data.

# Fit dPlN (u,s,a,b) with loc/scale fixed.
(u_hat, s_hat, a_hat, b_hat, _, _) = dpln_dist.fit(x, floc=0, fscale=1)
ll_dpln = dpln_dist.logpdf(x, u_hat, s_hat, a_hat, b_hat).sum()

# Fit lognormal (shape, scale) with loc fixed.
(shape_ln, _, scale_ln) = lognorm.fit(x, floc=0)
ll_lognorm = lognorm.logpdf(x, shape_ln, loc=0, scale=scale_ln).sum()

k_dpln = 4  # u, s, a, b
k_lognorm = 2  # shape (sigma), scale (exp(mu))

AIC_dpln = 2 * k_dpln - 2 * ll_dpln
AIC_lognorm = 2 * k_lognorm - 2 * ll_lognorm

print("AIC (lower is better):")
print("  dPlN     =", AIC_dpln)
print("  lognormal=", AIC_lognorm)
AIC (lower is better):
  dPlN     = 6550.366513886966
  lognormal= 6654.193797013038
# Bayesian example: posterior for mu with (sigma, alpha, beta) fixed.

mu_true, sigma, alpha, beta = 0.0, 0.6, 3.0, 2.0
x = dpln_dist.rvs(mu_true, sigma, alpha, beta, size=600, random_state=rng)

mu_grid = np.linspace(-1.0, 1.0, 600)

# Prior: mu ~ Normal(0, 0.7^2)
mu0, tau = 0.0, 0.7
log_prior = norm.logpdf(mu_grid, loc=mu0, scale=tau)

# Vectorized log-likelihood over the grid.
log_like = dpln_dist.logpdf(x[:, None], mu_grid[None, :], sigma, alpha, beta).sum(axis=0)

log_post_unnorm = log_prior + log_like
log_post = log_post_unnorm - special.logsumexp(log_post_unnorm)
post = np.exp(log_post)

fig = go.Figure()
fig.add_trace(go.Scatter(x=mu_grid, y=post, mode="lines", name="posterior p(mu|x)"))
fig.add_vline(x=mu_true, line_dash="dash", line_color="black", annotation_text="true mu")
fig.update_xaxes(title="mu")
fig.update_yaxes(title="posterior density")
fig.update_layout(title="Bayesian inference for mu (sigma, alpha, beta fixed)")
fig.show()
# Generative modeling: how alpha changes extremes (same mu,sigma,beta).

mu, sigma, beta = 0.0, 0.6, 2.0

alphas = [1.5, 3.0, 8.0]
N = 50_000

summary = []
for a in alphas:
    xs = dpln_dist.rvs(mu, sigma, a, beta, size=N, random_state=rng)
    summary.append(
        {
            "alpha": a,
            "median": float(np.median(xs)),
            "q95": float(np.quantile(xs, 0.95)),
            "q99": float(np.quantile(xs, 0.99)),
            "max": float(xs.max()),
        }
    )

summary
[{'alpha': 1.5,
  'median': 1.140906784453676,
  'q95': 6.5539715901899305,
  'q99': 19.640716249820006,
  'max': 2277.2901435545964},
 {'alpha': 3.0,
  'median': 0.8734973177090273,
  'q95': 3.1817893858567854,
  'q99': 5.771254078804477,
  'max': 88.8552624536584},
 {'alpha': 8.0,
  'median': 0.7198174084958109,
  'q95': 2.2711930633331923,
  'q99': 3.5614201502442215,
  'max': 15.600886798001836}]

11) Pitfalls#

  • Parameter constraints: require \(\sigma>0\), \(\alpha>0\), \(\beta>0\). The support is \(x>0\) in the standard form.

  • Moment existence: mean/variance/skewness/kurtosis require \(\alpha>1,2,3,4\) respectively; otherwise these summaries are infinite/undefined.

  • Numerical stability:

    • tails can underflow in pdf / cdf; prefer logpdf, logcdf, logsf.

    • computing \(R(t)=(1-\Phi(t))/\phi(t)\) naively can overflow/underflow; use stable log-space computations.

  • Fitting (MLE):

    • heavy tails can make optimization sensitive; check diagnostics (Q–Q plots, tail plots).

    • parameters can trade off (e.g., \(\sigma\) vs tail indices), so estimates can be noisy with limited data.

  • KS test after fitting: standard p-values are not exact when parameters are estimated from the same sample.
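To make the underflow point concrete, the sketch below re-implements the stable log-PDF from Section 3 in a compact self-contained form and evaluates it far out in the right tail, where the linear-space density underflows to exactly 0.0:

```python
import numpy as np
from scipy import special

LOG_SQRT_2PI = 0.5 * np.log(2.0 * np.pi)

def stable_logpdf(x, mu, sigma, alpha, beta):
    """Stable dPlN log-PDF (standard form), using log_ndtr for Mills ratios."""
    z = (np.log(x) - mu) / sigma
    y1, y2 = alpha * sigma - z, beta * sigma + z
    log_phi = lambda t: -0.5 * t**2 - LOG_SQRT_2PI
    log_R = lambda t: special.log_ndtr(-t) - log_phi(t)  # log Mills ratio
    return (np.log(alpha) + np.log(beta) - np.log(alpha + beta) - np.log(x)
            + log_phi(z) + np.logaddexp(log_R(y1), log_R(y2)))

lp = stable_logpdf(1e110, 0.0, 0.6, 3.0, 2.0)
# lp is finite (around -1000), but exp(lp) is far below the smallest double,
# so a naive pdf computation would return 0.0 and log(pdf) would be -inf.
```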

12) Summary#

  • dpareto_lognorm is a continuous distribution on \((0,\infty)\) with a lognormal-like bulk and Pareto tails.

  • The tail index \(\alpha\) controls upper extremes and determines which positive moments exist.

  • Moments have a clean closed form via the representation \(\log X = Z + E_1/\alpha - E_2/\beta\).

  • Sampling is easy with NumPy (normal + exponentials), and SciPy provides a full API including fit.

References#

  • Reed, W. J., & Jorgensen, M. (2004). The double Pareto-lognormal distribution — a new parametric model for size distributions. Communications in Statistics — Theory and Methods.

  • Hajargasht, G., & Griffiths, W. E. (2013). Pareto-lognormal distributions: Inequality, poverty, and estimation from grouped income data. Economic Modelling.

  • SciPy documentation: scipy.stats.dpareto_lognorm.